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The network of molecular chaperones mediates the folding and translocation of the many Received: September 7, 2012 

proteins encoded in the genome of eukaryotic organisms, as well as a response to stress. It Revised: October 29, 2012 

has been particularly well characterised in the budding yeast, Saccharomyces cerevisiae, where Accepted. November 8, 2012 

63 known chaperones have been annotated and recent affinity purification and MS/MS exper- 
iments have helped characterise the attendant network of chaperone targets to a high degree. 
In this study, we apply our QconCAT methodology to directly quantify the set of yeast chap- 
erones in absolute terms (copies per cell) via SRM MS. Firstly, we compare these to existing 
quantitative estimates of these yeast proteins, highlighting differences between approaches. 
Secondly, we cast the results into the context of the chaperone target network and show a dis- 
tinct relationship between abundance of individual chaperones and their targets. This allows 
us to characterise the 'throughput' of protein molecules passing through individual chaper- 
ones and their groups on a proteome-wide scale in an unstressed model eukaryote for the first 
time. The results demonstrate specialisations of the chaperone classes, which display different 
overall workloads, efficiencies and preference for the sub-cellular localisation of their targets. 
The novel integration of the interactome data with quantification supports re-estimates of the 
level of protein throughout going through molecular chaperones. Additionally, although chap- 
erones target fewer than 40% of annotated proteins we show that they mediate the folding of 
the majority of protein molecules (~62% of the total protein flux in the cell), highlighting their 
importance. 
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1 Introduction 

One of the goals of quantitative proteomics is to characterise 
the level of protein in a cellular system in order to under- 
stand how the gene products are organised and regulated. 
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Such understanding underpins any comprehensive systems 
model of an organism and its biological functions, where 
genome, transcriptome, proteome and metabolome interact 
with each other to maintain homeostasis or react to stress and 
perturbation. This in turn builds on the classical molecular 
biology 'dogma' coined by Crick, where 'DNA makes RNA 
makes protein', which now includes a more complex model 
involving multiple isoforms and interactions. For example, 
recent studies have characterised the entire yeast transcrip- 
tome [1], interactome [2, 3], measured translational control 
rates [4], protein locations [5] and half-lives [6]. Quantita- 
tive proteomics has targeted Saccharomyces cerevisiae ('yeast') 
as a model organism and several proteome-wide datasets 
exist [7-9]. Similarly, in mammalian cells, great strides have 
been made in the integration of transcription, translation 
and turnover of both RNA and protein to build genome-scale 
models [10]. This epitomises the challenges facing systems 
biology where integration of such information is needed to 
understand the full complexities of biological control and reg- 
ulation of function. 

Although such studies now build protein abundance and 
even half-life into the model, in order for a given protein to 
function it also needs to be folded, active, and delivered to 
its site of action. The proteins responsible for this are the 
chaperones, of which 63 are known in yeast [11]. They oper- 
ate as individual proteins or assembled into molecular ma- 
chines, to recognise their targets, promote the correct folding 
and help deliver them to their sub-cellular destination [12]. 
They play a vital role in preventing protein aggregation by 
recognising the nascent peptide chain to ensure proper fold- 
ing in a biologically meaningful timescale. Chaperones are 
also involved in other linked areas, including ribosomal RNA 
processing, translocation across membranes and cellular re- 
sponse to stress [13]. There are 63 yeast chaperones including 
the so-called heat-shock proteins: HsplOO, Hsp90, Hsp70, 
Hsp60 and the smaller HSPs that are ubiquitous in eukary- 
otic cells, and much is known about the mechanistic details of 
individual chaperones at the molecular level. However, a com- 
prehensive understanding of the cellular roles played by chap- 
erones is only just emerging. Recent pioneering work using 
affinity purification coupled to MS has defined a comprehen- 
sive qualitative dataset describing chaperone-chaperone and 
chaperone-target interactions for all 63 yeast chaperones [11] 
but we know little regarding the changes in these networks 
during stress conditions, or when recombinant protein ex- 
pression perturbs the system. 

Our understanding of chaperone networks and their prop- 
erties is emerging [11, 14, 15]. Frydman and colleagues 
demonstrated that two distinct and broad chaperones classes 
carry out different generic fundamental roles, delivered via 
common regulatory properties [14]. More recently, an analysis 
of chaperone interactome data addresses the scope of indi- 
vidual chaperone systems by clustering the chaperone-target 
network into modules that show some conserved properties, 
including evolutionary rates [16]. These modules are quite dif- 
ferent from the expected chaperone classes described above 



and strongly support the hypothesis that chaperones act in 
distinct communities, targeted at selected protein groups. 

Here we extend the previous studies, adding further quan- 
titative data to this network via QconCAT and other extant 
quantitative datasets available in the public domain, includ- 
ing target protein degradation rates. We show that there is 
a correlation between chaperone abundance and the work- 
load each chaperone has in the yeast cell, represented by the 
number of known interactors, the abundance of their targets, 
and the estimated folding flux. We also consider the total flux 
through each chaperone (and chaperone group) and consider 
this in the context of annotated biological function. We dis- 
cuss this in terms of sub-cellular localisation and previously 
reported throughput in chaperone pathways, as well as es- 
sentiality of protein targets. This represents a first look at the 
total folding flux placed upon the chaperone network derived 
from quantitative proteomics data and highlights the global 
role they play in regulating protein folding. 

2 Materials and methods 

2.1 Yeast and QconCAT sample preparation 

The QconCAT proteins were produced as previously de- 
scribed [17] using cell lysis by sonication and purified by 
Ni-MAC nickel affinity column (Novagen, Merck Millipore). 

S. cerevisiae (EUROSCARF accession number Y11335 
BY4742; Mat ALPHA; his3M; leu2&0; lys2A0; um3\0; 
YJL088w::kanMX4) was grown in defined minimal C-limiting 
(Fl) medium [18] using 10 g/L of glucose as the sole car- 
bon source. The Fl medium was additionally supplemented 
with 0.5 mM arginine and 1 mM lysine to meet the added 
auxotrophic requirements of the strain. For biological repli- 
cation, four cultures were grown in chemostat mode at a 
dilution rate of 0.1 h _1 and aliquots (15 mL) of the culture 
were centrifuged (4000 rpm; 4°C; 10 min). The supernatant 
was discarded, the pellet flash frozen in liquid nitrogen and 
stored at — 80°C for subsequent protein extraction. Cell counts 
were performed using an automated cell counter (Cellometer 
AUTOM10 by Nexcelom. http://www.nexcelom.com). Pro- 
teins were extracted by re-suspending the biomass pellets in 
250 |jlL of 50 mM ammonium bicarbonate (filter sterilised) 
containing 1 tablet of Roche complete-mini protease in- 
hibitors (+EDTA) (Roche Diagnostics, West Sussex, UK) per 
10 mL of ammonium bicarbonate. Acid-washed glass beads 
(200 |xL) were added. The pellet was subjected to repeated 
bead-beating for 15 bursts of 30 s with a 1 min cool down in 
between each cycle. The biomass was centrifuged for 10 min 
at 13 000 rpm at 4°C; the supernatant was removed and 
stored in low bind tubes on ice. Fresh ammonium bicar- 
bonate (250 (xL) with protease inhibitors was added and the 
pellet re-suspended by vortexing. The bottom of the extrac- 
tion vial was pierced with a hot needle, the vial placed on 
a fresh Eppendorf tube and quickly spun down (5 min at 
4000 rpm at 4°C. The flow-through and the supernatant 
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fraction were combined, the exact volume measured and 
the amount of protein determined by standard assay (Bio- 
Rad Laboratories, Hertfordshire, UK). Protein extracts were 
aliquoted and stored at — 80°C prior to subsequent digestion. 

The targets of QconCAT 1, 2 and 3 differ widely in abun- 
dance, whereas the peptides derived from each QconCAT are 
equimolar. Quantification therefore required multiple ana- 
lytical runs at different loadings of QconCAT to constrain 
quantification to analyte: standard ratios between 10:1 and 
1:10. To achieve this, the digestion strategy employed three 
separate digests for each bioreplicate, one codigest of yeast 
lysate with QconCAT and two unspiked digestions of yeast 
lysate. An amount of yeast lysate representing protein from 
21.5 million cells was dispensed into low-protein-binding mi- 
crocentrifuge tubes (Sarstedt, Leicester, UK) and made up to 
150 u,L by addition of 25 mM ammonium bicarbonate, and, 
in the case of the QconCAT codigests, 7 u.L of QconCAT 
solution. The proteins were denatured using 10 |jlL of 1% 
w/v RapiGest™ (Waters, Manchester, UK) in 25 mM am- 
monium bicarbonate and followed by incubation at 80°C for 
10 min. The sample was then reduced (addition of 10 (xL of 
60 mM DTT and incubation at 60°C for 10 min) and alkylated 
(addition of 10 (jlL of 180 mM iodoacetamide and incuba- 
tion at room temperature for 30 min in the dark). To allow 
quantification of the QconCAT, 10 |jlL of 2.15 pmol/|xL glu- 
fibrinopeptide (Waters) was added to each digest. Trypsin 
(Sigma, Poole, UK, proteomics grade) was reconstituted in 
50 mM acetic acid to a concentration of 0.2 |xg/|jLL and 10 u,L 
added to the sample followed by incubation at 37°C. After 
4.5 h an additional 10 (jlL of trypsin was added and the diges- 
tion left to proceed overnight. The digestion was terminated 
and RapiGest™ removed by acidification (3 |xL of TFA and 
incubation at 37°C for 45 min) and centrifugation (15 000 x 
g for 15 min). To check for complete digestion and to quan- 
tify the QconCAT, each digest was analysed by LC-MS us- 
ing a nanoAcquity UPLC™ system (Waters) coupled to a 
Synapt™ G2 mass spectrometer (Waters) in MS E mode and 
searched against a sequence database (see Supporting In- 
formation). The QconCAT was quantified by integrating the 
peaks generated from XIC of m/z 785.8 (internal standard 
glu-fibrinopeptide) and m/z 788.8 (glu-fibrinopeptide from 
QconCAT digestion). 

2.2 MS and data analysis 

The database search results corresponding to the CopyCAT 
were converted into a spectral database using Skyline [19] and 
the four most intense fragment ions were selected as putative 
transitions. The [ 12 C S ] lys, arg analyte and [ 13 C 6 ] lys, arg stan- 
dard equivalents of these four transitions were tested by appli- 
cation to both yeast only and yeast-QconCAT digest samples. 
SRM analysis was performed using a nanoAcquity UPLC™ 
system (Waters) coupled to a Xevo™ TQ triple quadrupole 
mass spectrometer (Waters) (see Supporting Information). 
The three transitions with the greatest S/N were selected for 



the final quantification analysis and optimised for collision 
energy and cone voltage. For final quantification, samples 
containing the protein equivalent of 200 000 cells with either 
0.2, 2 or 20 fmol of QconCAT were analysed by the previously 
described SRM methodology. The dilution was prepared by 
serial dilution of the yeast-QconCAT codigest by the unspiked 
yeast digest - this reduced the complications of absorption 
of diluted standard peptides and ensured that the QconCAT 
peptides, regardless of concentration, were maintained in an 
abundant peptide environment. A set of decoy transitions 
were created according to [20] and run against the yeast di- 
gest under identical instrument parameters. 

Label-free analysis was performed using a 'Top3' method- 
ology [21] on a yeast digest not containing QconCAT. The 
label-free analysis was performed on two platforms, an ion- 
mobility coupled data independent (HDMS E ) method on 
Synapt™ G2 (as previously described) and a data-dependent 
method on a Q-Exactive™ (Thermo Scientific, Hemel Hemp- 
stead, UK). HDMS E acquisition was performed as [22]. For 
the data-dependent method a portion (4 (jlL) of each yeast 
digest (100 000 cells/(jiL) was mixed with 1 |xL of standard 
protein (50 fmol/ixL rabbit muscle glycogen phosphorylase 
MassPREP™ Digestion Standard, Waters). The resulting 
spiked digests were analyzed by LC-MS using a Ultimate 
3000 RSLC™ nano system (Thermo Scientific) coupled to 
a Q-Exactive™ mass spectrometer (Thermo Scientific) (see 
Supporting Information). The data were processed with Pro- 
genesis (version 4, Nonlinear Dynamics, Newcastle upon 
Tyne, UK). Samples were aligned according to retention time 
using a combination of manual and automatic alignment. 
Default peak picking parameters were applied and features 
with charges from 1 + to 4 + featuring three or more isotope 
peaks were retained. Database searching was performed us- 
ing Mascot (Matrix Science, London, UK) (see Supporting 
Information). These identifications were imported into Pro- 
genesis and the resulting feature set was exported to Progen- 
PostProcessor [23] which can produce 'Top N' quantification 
values from Progenesis feature files. Based on the glycogen 
phosphorylase standard, Top 1, Top 2 and Top 3 based quan- 
tification was performed (depending on the number of pep- 
tides observed per protein, 75% of proteins were Top3) and 
the highest possible N used for final quantification. We refer 
to this dataset as Q-Exactive Hi3. 

2.3 QconCAT Data processing and analysis 

QconCAT quantification values were produced using the 
mProphet package [20]. Decoy transitions were generated for 
each equivalent target transition using the provided mGen.pl 
Perl script using the SPIKE_IN workflow option. Waters .raw 
files were converted to the required .mzXML format using 
wolf-MRM (http://tools.proteomecenter.org/software/wolf- 
mrm/wolf-mrm.zip). The .mzXML files for both target and 
decoy data were merged using an in-house Perl script to align 
the acquired target and decoy transitions. Merged .mzXML 
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files were passed through mMap.pl using -mach TSQ op- 
tion with the transition list output produced using mGen.pl. 
The mMap.pl output was then submitted to mQuest.pl using 
an optimised parameter file (see Supporting Information). 
The output files from mQuest.pl were then submitted to 
mProphet.pl to produce the final light-heavy (target- standard) 
ratios and calculate-associated false discovery rates for all pep- 
tide quantifications. Final copies per cell (cpc) were then cal- 
culated from the analyte: standard ratio, known concentration 
of spike-in heavy standard and the number of cells loaded onto 
the column (obtained from a cell count performed on each 
sample). Peptide quantification values were then used only 
for those where at least three out of four biological replicates 
passed at 1% false discovery rate and all of which had an S/N 
>5. 

3 Results and discussion 

3.1 Absolute quantification of 63 chaperones 

Using our QconCAT methodology [24], as part of a larger 
scale project to quantify in absolute terms the entire cellular 
proteome of S. cerevisiae [25], we designed three recombinant 
proteins constituting concatamers of surrogate peptides to en- 
able the quantification of the 63 annotated chaperones in the 
yeast genome. These recombinant proteins were expressed 
in media supplemented with stable isotope labelled amino 
acids to generate 'heavy' QconCAT proteins that were then 
spiked into yeast protein samples to enable absolute quantifi- 
cation in copies per cell, across four biological replicates. Two 
peptides were nominated to quantify each protein. For each 
peptide, cpc values were averaged across replicates where 
available. Protein abundances were then calculated from the 
two mean values, taken as the maximum of the two pep- 
tides. Such protein quantifications are termed Type A, where 
acceptable data are available for both analyte and surrogate 
peptides. In other instances, the analyte quantotypic peptide 
was not observed although the QconCAT peptide was (Type 
B), and in a few, neither peptide extracted ion chromatograms 
(XICs) were observed above the minimum S/N (Type C). A 
complete list of the chaperone proteins with cpc is in Table 1 
and details of peptides selected and individual peptide-level 
cpc values are in Supporting Information Table SI. Using 
our standard QconCAT method we obtained absolute quan- 
titative values in cpc for 51 of the 63 chaperones. We up- 
dated our data analysis pipeline to incorporate the mProphet 
software [20], which supports semi-automated detection and 
quantification of chromatographic peaks and provides false 
discovery rate analysis from decoy transitions. Full details 
on the data processing to derive cpc are given in Section 2. 
Figure 1 shows representativeXICs for three transitions anal- 
ysed for the peptide IYEQEFGTTK from Cct5 (further exam- 
ples are provided in Supporting Information Fig 1A and B). 
This is a representative Type A quantification where good re- 
peatability is apparent across four bioreplicates. We were able 

© 2013 WILEY- VCH Verlag GmbH & Co. KGaA, Weinheim 



to quantify chaperones across a broad dynamic range from 
250 to 440 000 cpc. There is good coverage of all the chaperone 
classes (including the CCT proteins that have proven refrac- 
tive to epitope-tagging strategies) and we have succeeded in 
quantifying some proteins that other label-mediated strate- 
gies have missed (e.g. HSP70 protein Ssa3). In part, this is 
due to our targeted approach that selects unique peptides 
wherever possible, avoiding peptides shared between several 
closely related proteins. We are also able to offer an upper 
limit on three SMALL class chaperones that no other method 
has yet reported, according to PaxDB [26]. 

The QconCAT approach benefits from inclusion of at least 
two peptides per protein that were observed via LC-MS for 
most of our targets, allowing comparison of sibling peptides 
from the same parent protein (Fig. 2). With few exceptions, 
there was good agreement between sibling peptides. The 
chaperone protein group was one of our first assemblies of 
designed QconCAT peptides and in retrospect, there were 
sub-optimal choices for some standard surrogate peptides. 
We have become increasingly aware of the importance of 
poor cleavage contexts in the standard of the endogenous 
yeast protein (Fig. 2) and we have observed the attendant 
missed cleavage peptides in subsequent MS analyses. These 
are typified by cases where dibasic sites or acidic amino acids 
lie close to or span the desired cleavage site (e.g. RR, KK, KR, 
RK, etc.), which are now deprecated in QconCAT design, and 
which should also inform any selection of proteotypic pep- 
tides [27, 28]. The peptide pairs in Fig. 2 are ordered to place 
the higher of the two abundance values on the %-axis. In all 
cases where one of a sibling pair is subtended by a missed 
cleavage site, its cpc quantification was lower than its partner 
because a significant fraction of the analyte peptide signal was 
lost to the miscleaved peptides that would manifest different 
retention times, masses and transitions. Some poor cleavage 
contexts are highlighted in Fig. 2, such as peptide 1 that is 
subtended by KK at the N-terminus in the parent protein. 

3.2 Comparison with other protein quantification 
methods 

As others and we have previously noted, there is a consid- 
erable discrepancy in the quantified abundance of proteins 
determined by alternative technologies. In Fig. 3 we compare 
the values obtained from our label-mediated QconCAT ap- 
proach to other MS-based studies and to epitope-tagging ap- 
proaches; Scatterplots for all pairwise comparisons were also 
generated (Supporting Information Fig. S2). The dendogram 
is calculated from transformed parts per million (ppm) values 
that are used to calculate Spearman rank correlations between 
pairwise chaperone datasets (see Section 2 and Supporting In- 
formation Table S2). Interestingly, the technically related ap- 
proaches cluster together with epitope-tagging and MS-based 
methods found in two independent clades. Within the MS 
clade, the two label-mediated approaches show the highest 
correspondence, from our own studies and the SILAC study 
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Table 1. Yeast chaperones and absolute abundance expressed in copies per cell (cpc). UniProt, SGD and systematic ORF names are 
provided along with QconCAT-based quantification. The quantification type refers to the status of the endogenous peptide and 
surrogate peptide XIC via mProphet analysis, classifying cases where the endogenous peptide is not observed above the S/N at 
any of the three spike-in concentrations as Type B, and cases where neither peptide is observed as Type C 
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20 000 
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A 
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A 
A 


1 
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± 
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A 
A 


CCT 


Cct5 


YJR064W 


P40413 


6300 


± 


150 


A 


CCT 


Cct8 


YJL008C 


P47079 


6000 


± 


100 


A 


CCT 


Cct7 


YJL111W 


P42943 


2900 


± 


100 


A 


AAA+ 


Hsp104 


YLL026W 


P31539 


27 500 


d= 


2000 


A 


AAA+ 


Hsp78 


YDR258C 


P33416 


10 500 


± 


500 


A 


AAA+ 


Mcx1 


YBR227C 


P38323 


1200 


± 


250 


A 


HSP70 


Ecm10 


YEL030W 


P39987 


<600 






B/B 


HSP40 


Cwc23 


YGL128C 


P52868 


<600 






B/B 


HSP40 


Jjj2 


YJL162C 


P46997 


<6000 






B/B 


HSP40 


Jjj3 


YJR097W 


P47138 


<600 






B/B 


HSP40 


Jidl 


YPR061C 


Q12350 


<600 






B/B 


HSP40 


Hljl 


YMR161W 


P48353 


<600 






B/C 
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Table 1. Continued 



Chaperone class 


SGD gene name 


Systematic name 


UniProt 


QconCAT (CPC ± SEM) a) 


Type 


HSP40 


Apj1 


YNL077W 


P53940 


<6000 


B/C 


CCT 


Cct3 


YJL014W 


P39077 


<600 


B/C 


SMALL 


Sno4 


YMR322C 


Q04902 


<600 


B/C 


SMALL 


Hsp33 


YOR391C 


Q08914 


<600 


B/C 


SMALL 


Hsp32 


YPL280W 


Q08992 


<600 


B/C 


HSP40 


Ydj1 


YNL064C 


P25491 




C/C 



a) An upper quantification limit can be placed for proteins with B-type peptides where no A-type peptides were available. 
- indicates no quantification. 

Errors across four biological replicates are expressed as SEM. All cpc and error values were rounded (to nearest 500 for cpc > 10 000, 
nearest 50 for cpc > 1000 and nearest 10 for lower cpc values). 



from de Godoy and colleagues [7]. It is perhaps notable that 
different growth conditions do not produce apparent greater 
variation than different methodologies as shown by the close 
similarity between the GFP-tagging datasets for yeast grown 
in YPD and sucrose-deficient media [9]. Naturally, some dif- 
ferences will also be expected between the yeast strains, as 
well as growth conditions. Our label-free (Q-Exactive Hi3 
DDA) and label-mediated (QconCAT SRM) analyses also yield 
different chaperone cpc values despite being performed on 
identical yeast samples (strain and growth conditions), since 



they cluster to different regions of the dendogram. Harmon- 
isation of the different quantification methods still seems to 
be out of reach. 

This comparison conceals some of the detail of the various 
studies, none of which were able to obtain values for all 63 
yeast chaperones. Indeed as noted above, the CCT complex is 
poorly represented in the eptitope tagging studies [8,9] though 
it is present in most of the MS-based quantification studies. 
Figure 4 shows the eight major CCT proteins quantification 
values, expressed in ppm (for the purposes of this paper 1 



Replicate 1 



Replicate 2 



Replicate 3 



Replicate 4 



-lYEQEFGTTK y7t»I*t 
-lYEQEFGTTK yS tare 
-lYEQEFGTTK y9 Ul 




-lYEQEFGTTK y7 tarfM 
-IYEQEFGTTK y8tarf« 
-lYEQEFGTTK y9 !ar s « 
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- lYEQEFGTTK y8 t. 
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-lYEQEFGTTK y9 rel 
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-lYEQEFGTTK yS ref 
-lYEQEFGTTK y9ref 




Figure 1. Extracted XICs from mProphet for three selected transitions for both light (yeast analyte) and heavy (QconCAT surrogate) peptide 
sequence lYEQEFGTTK. Grey boxes represent the peak group from which areas are calculated for quantification. All four bioreplicate 
XICs are shown displaying excellent reproducibility. The parent protein, CCT5, was not observed in either of the epitope-tagging datasets 
available [8,9]. 
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Protein Abundance as Measured by Peptide 1 (cpc) 

Figure 2. Sibling peptide XV-plot for all chaperone proteins quan- 
tified using two surrogate QconCAT peptides. The most abun- 
dant of the pair is along the x-axis. Large discrepancies be- 
tween sibling peptide pairs can in part be explained by poor 
cleavage contexts. Peptide-1: The lower abundance sibling (Ssa1 : 
AEETISWLDSNTTASK) for this protein contains a dibasic (KK) 
tryptic site at the N-terminus, of which a missed cleavage vari- 
ant ([K] AEETISWLDSNTTASK) has been observed in subsequent 
mass spectrometric analysis. Additional signal loss may be at- 
tributed to the acidic residues surrounding the C-terminal cleav- 
age site (Glu-Glu), which are known to increase the propensity 
of a missed cleave [43]. Peptide-2: The peptide LNDAVEYVSGR 
from Hsp12 has an interspersed dibasic N-terminal tryptic site 
(KSK) that would affect proper cleavage [27,28]. Peptide-3: The 
peptide LAAEDYIGSAVKfrom Ssz1 with lowerabundance may be 
attributed to missed cleavage via an interspersed dibasic tryptic 
site at the N-terminal and glutamic acid immediately proceeding 
the C-terminal tryptic site. 



tional protein tagged on to it [29] and is believed to mediate the 
folding of a significant fraction of all cytosolic proteins [30]. 
Furthermore, the CCT proteins are all classed as essential by 
saccharomyces genome database (SGD) [31, 32] suggesting 
that techniques that perturb the structure or folding of such 
proteins could be lethal. It is also interesting to note that this 
class of chaperone shows the lowest CV of the abundance 
values across all the classes, given that it is an octamer with 
one:one stoichiometry [29]. This result is independent of the 
method of quantification (see Supporting Information Table 
S3). Comparing the quantifications across datasets, our most 
recent analysis using label-free methods on the Q-Exactive 
yields a stoichiometry closest to one:one for this chaperone 
class. 

Finally, it is evident that the different quantitative ap- 
proaches report different overall estimates of protein abun- 
dance, as epitomised by Fig. 4. Although it is difficult to di- 
rectly compare the methods since they are converted to ppm 
values in PaxDB with somewhat arbitrary conversion factors, 
we can make direct comparisons of those reporting absolute 
quantitative values in cpc. In this case, the mean epe values for 
all chaperones are 36 000, 52 000, 42 000 and 113 000 for the 
QconCAT, TAP-tagging, HDMS E and Q-Exactive Hi3 label- 
free methods, respectively. The QconCAT data is lowest; this 
partly reflects the high sensitivity of this technique to report 
quantification via SRM at < 1000 cpc values for additional 
chaperones, lowering the overall average. For example, five 
of the HSP40 class of chaperones are reported at less than 500 
cpc using our QconCATs but are absent from the HDSM E 
and Q-Exactive Hi3 label-free quantifications entirely. It is 
also possible that epitope tagging can induce over-expression 
in some cases, which could lead to inflated cpc values. 



ppm = 60 cpc), extracted from the PaxDB database [26]. Only 
Cct4 has been quantified by any of the epitope-tagging strate- 
gies to-date, while all of the MS-based methods have good 
success with all proteins in this complex. This is perhaps un- 
surprising since the CCT complex is formed by a heteromeric 
8-mer ring structure which could be perturbed by any addi- 



3.3 Relationship between chaperone and target 
abundance 

The chaperones mediate protein folding in the cell, regu- 
lating the stability and preventing mis-folding for a large 
component of the genome. Gong and colleagues [11] 
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Figure 3. Dendogram of chaper- 
one quantification values from 
different experimental method- 
ologies. All but the QconCAT, 
Q-Exactive Hi3 and HDMS E 
datasets were obtained from 
PaxDB [26]. All protein abun- 
dance values are expressed 
as ppm for comparisons with 
PaxDB data, assuming 60 mil- 
lion protein cpc in total for 
the QconCAT, Q-Exactive Hi3 
and HDMS E datasets. Primary 
references for the independent 
datasets are: Ghaemmaghami 
[8], Newman SD and YEPD [9], 
and de Godoy [7]. 
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Figure 4. Protein quantification of the members of the CCTchap- 
erone complex by alternative experimental strategies. The CCT 
complex is formed by an 8-mer of subunits forming a ring struc- 
ture and is expected to retain a 1:1 stoichiometry between sub- 
units. All protein abundance values are expressed as ppm for 
comparisons with PaxDB data, assuming 60 million copies in total 
forthe QconCAT, Q-Exactive and HDMS E datasets. The absence of 
measured quantification values is represented by a star symbol. 



identified 4340 candidate substrates from their tandem affin- 
ity purification strategy that represents the majority of the 
yeast proteome. This qualitative map of the yeast chaper- 
ome network however lacks quantification, and here we add 
our own and published yeast protein abundance values to 
consider the folding 'workload' undertaken by each chaper- 
one or its class. Additionally, given the tendency of affinity 
pull-down data to contain false positives, we have generated 
an additional 'high-quality' dataset of the chaperone interac- 
tome, comprising a total of 3649 interactions over 60 chap- 
erones. All chaperone interactions were obtained from three 
public protein-protein interaction datasets (BioGRID [33], 
MIPS [34] and STRING [35]). These were further filtered by 
excluding chaperone-chaperone interactions, retaining inter- 
actions where the reciprocal interaction was also observed 
(within a dataset) and in the case of STRING the interaction 
confidence score was >0.7 (high-confidence). The three fil- 
tered datasets were then combined to produce a 'high-quality' 
dataset, covering 60 chaperones, 1711 substrates and 3649 
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Chaperone Abundance (cpc) 

Figure 5. Number of chaperone-targets versus chaperone abun- 
dance. The number of non-chaperone substrates (or targets) is 
plotted against the abundance of each chaperone (in copies per 
cell units) for the QconCAT quantification dataset, against num- 
ber of chaperone targets. A moderate correlation is observed with 
the more abundant chaperones having more interacting proteins 
(R sp : 0.49, p= 0.00002). 



interactions. Analysis of the overlap between the datasets 
showed that MIPS provided no information additional to the 
combination of BioGRID and STRING (see Supporting 
Information Fig. S3A-C). The filtering process retained 
all but one of the complexes exhibiting reciprocity as re- 
ported by Gong and colleagues (see Supporting Information 
Fig. S3D). 

We next considered the relationship between chaperone 
abundance and the number of interactors, reasoning that 
chaperones with high numbers of targets ought to be abun- 
dant. Although no previous correlation had been noted [11], 
a modest but significant positive correlation is observed here 
(Fig. 5), for the high-quality filtered interaction set. This cor- 
relation is observed independently of the method used to 
quantify the chaperones, generating a Spearman Rank cor- 
relation statistic of 0.08-0.54 (see Supporting Information 
Table S4). Simplistically, there is a correlation between the 
qualitative workload of a chaperone (as measured by the num- 
ber of interactors) and its abundance in the cell. In this case, 
we have not included the chaperone-chaperone interactions 
as substrates since many will be co-chaperone interactions 
and do not strictly represent substrates. The correlation is 
also observed for the unfiltered complete set of interactions 
reported by Gong and colleagues [11], and some of the cor- 
relations observed are slightly stronger (Spearman Rank cor- 
relations between 0.48 and 0.74). Figure 5 also breaks down 
the chaperones into classes, showing that the generally low 
abundance HSP40 class have relatively few substrates while 
the generic 'promiscuous' chaperones with many substrates, 
such as the SMALL class and Hsp70 members (including 
Ssal), are high in abundance. 

These analyses only consider the number of interactors, 
not the total 'volume' of protein folding/translocation being 
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mediated by a given chaperone. Accordingly, for each chaper- 
one (c), the volume of protein (V c ) was calculated as the total 
abundance (cpc n ) of all n substrates. 

Vc = X>pc n 

l 

However, the volume does not directly consider the ac- 
tual folding workload placed on chaperones which is influ- 
enced by rate of substrate turnover. To account for this, we 
use measured protein degradation rates fc^g [6] to estimate a 
synthesis rate, lc syn , for individual chaperone targets, as shown 
below. 

^syn = CpC n X fe^eg 

We can then sum these values to estimate the total work- 
load or flux F c (in molecules per minute) handled by an indi- 
vidual chaperone or chaperone class. 

n 

F -Vfe 

1 c — / , ^syn 
1 

These values are based on some assumptions. We assume 
that protein abundances are in steady state ( = 0) and the 
rate of synthesis captures the total flux dealt with by the at- 
tendant chaperones; we split flux on a pro rata basis across 
chaperones when there are multiple 'parents'. Missing fe^eg 
were substituted by the geometric mean across the entire 
dataset for instances where no turnover data was available for 
some substrates. We have also ignored growth rates, since 
our raw quantification and half-life data come from different 
sources of yeast experiment. Effectively, this adds a constant 
to fcdeg values, akin to the dilution rate in a controlled culture 
system, which would alter our estimated fluxes. Neverthe- 
less, we believe that despite the limitation of the data and 
assumptions, these values represent the most accurate cur- 
rent estimates of protein flux available that are, importantly, 
also comprehensive. We refer the interested reader to a re- 
view article that addresses several key issues and challenges 
in measuring protein turnover, which develops these ideas 
further [36]. 

Here, and for all further calculations, we use the filtered 
high-quality set of 3649 interactions rather than the complete 
set from Gong and colleagues [11]. We do not have a genome- 
wide SRM-based quantitative dataset from QconCATs and 
therefore use the more comprehensive datasets from PaxDB 
and our own label-free data to capture substrate abundance. 

When considering either the total chaperone substrate 
abundance V c , or more formally, the total chaperone sub- 
strate flux F c , the correlation is maintained or strengthened. 
The Spearman correlation coefficients range from 0.26 to 
0.69, for the comparison of chaperone abundance with flux 
as shown in Fig. 6. This demonstrates that chaperones medi- 
ating the folding of a large flux of expressed protein are them- 
selves high-abundance proteins. The full set of both Pearson 
linear and Spearman Rank correlations are provided in Sup- 



porting Information Table S4, which highlights that signifi- 
cant correlations are obtained for both volume and flux. We 
also see similar correlations when using the larger unfiltered 
protein interactions set (data not shown). 

Figure 6 also clearly illustrates the reduced sensitivity of 
several label-free and epitope-tagging methodologies which 
have failed to quantify proteins (chaperones or substrates) 
below 1000 cpc, unlike the MS-based methods such as SILAC 
or the QconCAT approach presented here. 



3.4 Chaperone workload and efficiency 

As previously discussed, we can estimate the total chaper- 
one 'workload' and the attendant 'efficiency' of individual 
chaperones and chaperone classes, defining the workload as 
the total substrate flux (F c ). This definition, considers the 
cpc per minute estimated from the protein abundance and 
half-life as a proxy for the total amount of work a chap- 
erone must undertake. Although this approach has some 
caveats, and does not directly factor in growth, it does pro- 
vide a broad snapshot of the overall volume of protein folding 
meditated by individual chaperones. Table 2 shows the top 
15 ranked chaperones by workload, calculated and ranked 
for the HQ filtered set of 3649 interactions. The proteins at 
the top of the list agree with expectation, containing 'gen- 
eralist' chaperones that interact with significant fractions of 
nascent polypeptide chains close to or at the ribosome as 
part of the ribosomal-associated complex (RAC), or traffick- 
ing from Hsp70s and Hsp40s to Hsp90s. As would be ex- 
pected, Table 2 is dominated by these chaperone classes, 
including RAC members Ssbl/2, other cytosolic Hsp70s in- 
cluding Ssal, and nucleotide exchange factors such as Ssel 
that are involved in trafficking onto the Hsp90 complex. It 
has been suggested that 20% of nascent polypeptide chains 
pass through the RAC/Hsp70/Hsp40 route [12] consistent 
with the high workload we observe here. Our estimate of 
over 1 million molecules per minute passing through the 
ribosome-associated Ssbl, suggests that each Ssbl protein 
handles around 14 molecules per min. Similar statistics are 
observed when the unfiltered list of candidate substrates is 
used (Supporting Information Table S5), though the esti- 
mated flux per chaperone increases to 25 molecules/min. 

Table 2 raises the question as to whether there are chap- 
erones (or chaperone groups) that are extremely (in) efficient 
in terms of the protein folding they mediate? Are there chap- 
erones that despite being low abundance, mediate folding 
of a high-protein flux (high efficiency), or vice versa (low 
efficiency). This is an important concept, since not all chap- 
erone proteins are expected to be directly and singly respon- 
sible for the folding of their targets [15, 37]. To address this, 
the total substrate flux F c can be normalised by dividing by 
the chaperone's own abundance to calculate a workload effi- 
ciency. These are plotted as box-and-whisker plots in Fig. 7, 
on a log2 scale, grouped by chaperone class, using the PaxDB 
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Figure 6. Analysis of chaperone abundance versus total target flux, F c . For each quantification dataset the chaperone protein abundances 
were plotted against the total flux F c (workload) calculated from all its interacting proteins (where abundance values were obtained). 
Correlations are observed for each dataset, ranging from 0.26 to 0.69, which are all significant (p < 0.05) except the 0.26 obtained for 
HDMS E . For all but the Q-Exactive Hi3 and HDMS E datasets, protein abundance data was taken from PaxDB [26], assuming 1 ppm = 60 cpc. 



values from the de Godoy analysis to represent chaperone and 
target abundance [7]. Very similar results are obtained when 
using any of the other comprehensive yeast protein abun- 
dance datasets (data not shown). A considerable variation is 
observed across the chaperones and their classes with appar- 



ently super-efficient chaperones such as Hsp70-Ssb2 (with 
log2 efficiency of >8) through to the apparently under-used 
chaperones such as the Hsp40-Djpl (with a log efficiency 
of <— 4). The latter is one of the regulatory J-domain chap- 
erones which promotes the ATP-ase activity of Hsp70s and 



© 2013 WILEY- VCH Verlag GmbH & Co. KGaA, Weinheim 



www.proteomics-journal.com 



1286 P. Brownridge et al. Proteomics 201 3, 13, 1276-1291 



Table 2. Overall folding workload for top15 chaperones ranked by total substrate flux 
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Ssb2 


YNL209W 


HSP70 


85 650 


172 


13 675 400 


1 172 700 


Ssb1 


YDL229W 


HSP70 


68 570 


348 


9 666 400 


743 300 


Ssz1 


YHR064C 


HSP70 


63 090 


115 


6 167 800 


423 800 


Ssc1 


YJR045C 


HSP70 


58 050 


40 


2 802 400 


346 700 


Ssa4 


YER103W 


HSP70 


5700 


29 


2 261 500 


323 800 


Zuo1 


YGR285C 


HSP40 


31 290 


89 


4 628 100 


314 100 


Hsc82 


YMR186W 


HSP90 


84 370 


90 


3 853 900 


274 600 


Ssa1 


YAL005C 


HSP70 


335 460 


179 


3 210 800 


247 100 


Cct2 


YIL142W 


CCT 


7070 


29 


1 080 700 


159 000 


Cct4 


YDL143W 


CCT 


20 010 


32 


1 044 700 


1 50 000 


Hsp60 


YLR259C 


HSP60 


46 960 


30 


1 083 200 


122 800 


Sse1 


YPL106C 


HSP70 


61 980 


79 


2 408 400 


120 300 


Hsp82 


YPL240C 


HSP90 


23 040 


89 


1 338 500 


103 000 


Kar2 


YJL034W 


HSP70 


26 300 


66 


935 100 


96 300 


Gim3 


YNL153C 


PFD 


4060 


120 


1 162 300 


95 400 



a) Chaperone cpc values were taken from our QconCAT yeast quantification. 

b) As determined from the high-quality filtered subset of 3649 interactions. 

c) Substrate abundance values were taken from PaxDB for the SILAC-based yeast quantification determined by de Godoy and colleagues [7] 
and converted to cpc assuming 60 million protein molecules per cell. 

d) Flux was calculated as the sum over all chaperone substrates of the product of total substrate volume and the degradation rate, the 
latter obtained from the half-life study of Belle et al. [6]. 



hence may not be directly interacting with true substrates; 
indeed, it is more abundant than the summed abundance of 
all its non-chaperone substrates. Interestingly, the prefoldins 
(PFD) as a class are the most efficient, an observation which 
is conserved irrespective of the quantification method used to 
calculate the values (Supporting Information Fig. S4). These 
proteins form a heterohexameric complex that deliver nascent 
chains of TRiC/CCT substrates to this complex for refolding, 
including key structural proteins such as actin and tubulin. 

We can also consider the total volume and workload di- 
vided amongst the different chaperone classes, aggregating 
the abundance or flux of the targets to each class. If we as- 
sume that proteins not classed as chaperone targets can fold 
independently, we can consider the fraction of proteins (or 
protein abundance or flux) that is mediated by chaperone in- 
teractions in yeast (Fig. 8). Only 36% of proteins have been 
classed as chaperone targets but they constitute the majority 
(57%) of all protein volume in the cell, which itself represents 
62% of the total flux of protein synthesis. Figure 8 also shows 
how the relative workload changes when we consider the flux 
weighted fraction compared to just the number of targets for 
each chaperone class (note: we share flux across classes pro 
rata when a substrate's folding is mediated by more than one 
chaperone from different classes). This shows that although 
the Hsp70 class acts on fewer than 50% of known chaper- 
one substrates, these proteins account for over 70% of the 
total chaperone-mediated protein synthesis flux in the cell. A 
similar small expansion is shown for the Hsp90 class while 
most others fall or stay roughly constant. Indeed, despite 
their efficiency and essentiality, the PFD and CCT chaper- 
ones mediate folding of less than 2 and 4%, respectively, of 
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Figure 7. Relationship between chaperone abundance and nor- 
malised flux via each class. Box-and-whisker plots show the range 
of chaperone efficiency (flux/chaperone abundance) values cal- 
culated for individual chaperones in each class. The bold line 
shows the median value for each class, with boxes represent- 
ing the inter-quartile ranges and whiskers the full extent of the 
minimum/maximum values in the normal way. The classes are 
ordered on median value from left to right. All workload effi- 
ciency values were calculated using the data from de Godoy and 
colleagues [7], normalising the total protein flux F c by the chap- 
erone cpc value, and then converting to a log2 value. 
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Figure 8. Overall chaperone workload in yeast. The 
two sets of pie charts consider the overall work- 
load for chaperone classes in yeast, calculated by 
the number of proteins whose folding they medi- 
ate or scaled by the total protein flux for which they 
are responsible. In both cases, substrate counts or 
flux are shared pro rata between classes when more 
than one chaperone from different classes acts upon 
them. The two left most charts show how the amount 
of protein whose folding is regulated by chaperones 
in yeast is expanded when considered by total pro- 
tein flux, and similar differences are observed when 
considered the breakdown of the data into chaperone 
classes on the right. 



chaperone-mediated protein flux. The latter estimate is lower 
than previously reported values of ~10% for CCT [30, 37], 
though we note our value is likely to be an underestimate 
given that we filter the interacting substrate set. Despite this, 
the CCT class in particular is enriched in targets which are 
themselves deemed essential (Supporting Information Fig. 
S5). Around 50% of the protein abundance or flux regulated 
by CCT is annotated as 'essential' by the yeast genome dele- 
tion project [31], more than for any other chaperone class. 

Integrating the abundance data with annotated essential- 
ity, we can calculate how much of the yeast proteome is 
chaperone-mediated and deemed essential. The folding of 
approximately ~9% of known ORFs are chaperone mediated 
and classed as essential, but this doubles to ~18% when 
considered as a fraction of total protein flux, reinforcing the 
key role played by chaperones. Nevertheless, ~16% of es- 
sential protein synthesis apparently requires no chaperone 
mediation. 

Using the abundance estimates from different experimen- 
tal approaches, we can estimate the total throughput of dif- 
ferent chaperone classes with respect to the total protein in 
the cell (Supporting Information Table S6). Around 6.5% of 
all protein flux passes through the PFD/CCT route, compa- 
rable to the previous estimates of ~10% [12, 30, 38] in both 
yeast and bacterial systems. Similarly, around 7% of all pro- 
tein folding is mediated by Hsp90s (a slightly lower estimate 
than the 20% previously reported). However, our data esti- 
mates that 44% of all protein synthesis passes through the 
Hsp70 class including that allied to the RAC, Hsp70, Hsp40 
route, which suggests previously reported values of 20% may 
be under-estimates [38, 39]. 



3.5 Chaperone target sub-cellular localisation 

Since the sub-cellular localisation of the majority of yeast 
proteins has been characterised by a variety of studies, 
notably [5], we can scale the proteins in different locations 
by their abundance rather than just the number of different 
protein species. This is shown in Fig. 9 for one of the 
more comprehensive quantification datasets [7]. Most of the 
cellular proteome is in the cytoplasm, as would be expected, 
followed by the membrane and mitochondrion, whether 
considered by proteins annotated in the genome or scaled 
by abundance. However, the cytoplasmic, membrane and 
mitochondrial protein abundances are expanded versus 
expectation (i.e. the total copy number of cytosolic proteins 
is larger than the total number annotated as such in the 
genome). Conversely, the nuclear protein abundance is 
reduced compared to expectation. This makes sense since 
despite the large number of proteins encoded in the genome 
destined for the nucleus, many of them will have signalling 
and regulatory roles (e.g. transcription factors) and would not 
be expected to be continually expressed or high abundance. 

When the chaperone-mediated substrates are analysed by 
class, different distributions of substrate location emerge. 
The SMALL and PFD class both have quite different 
distributions to that for all proteins in Fig. 10, specialising in 
several non-cytosolic categories. Similarly, PFDtargets con- 
stitute a large fraction of protein destined for the vacuolar 
membrane, nucleus and ER. We also noted a small expan- 
sion in the mitochondrial class for Hsp90s, which concurs 
with the module assigned to Hsc82 by Bogumil and col- 
leagues that also includes known mitochondrially active yeast 
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Figure 9. Sub-cellular disposition of the yeast proteome. The pie charts show the total proteome split into component sub-cellular 
localisations, annotating proteins from Huh et al. [5]. In A, the number of proteins is used as a fraction of the total, while in B this is 
scaled by the total protein abundance. In both cases, proteins assigned to multiple loci and split pro rata between the different locations, 
either by count or by abundance. The differences between A and B highlight the expansions in abundance where the total concentration 
of protein assigned to a specific location exceed the relative number of proteins encoded in the genome. Sub-cellular locations containing 
less than 2% are not labelled for clarity, but also include ER to Golgi, Golgi to ER, Golgi to vacuole, actin, bud, bud neck, cell periphery, 
endosome, lipid particle, microtubule, peroxisome and spindle pole. 



chaperones [16]. Indeed, this enrichment is statistically sig- 
nificant (p < 0.01) as are many of the individual values of 
fractional target protein volume in selected sub-cellular lo- 
cations using the Expression Analysis Systematic Explorer 
(EASE) modified Fisher Exact statistical test [40]. This in- 
cludes the microtubular and mitochondrial targets for the 
PFD, ER, Golgi and nuclear targets for the Hsp40s and cyto- 
plasm for the 'generalist' Hsp70s, all consistent with expec- 
tation in the literature (cf. [41,42]). The full list of p-values 
for each sub-cellular location by chaperone class are listed in 
Supporting Information Table S7. 

The data are consistent with a degree of specialisation for 
different chaperone classes in terms of the sub-cellular des- 
tination of their targets, notwithstanding the fact that the 
majority of their targets are normally cytosolic. Indeed, the 
observed sub-cellular specialisation by target protein volume 
is also statistically significant across the chaperone classes 
(p < 0.001, using Kruskal-Wallis), an observation that is 
clear when the data are viewed from the perspective of 
the sub-cellular localisations (Supporting Information Figs. 
S6 and S7) which highlights, for example, the PFD/CCT 
preference for actins and microtubule proteins such as 
tubulin. 

No clear specialisations for substrate location were re- 
ported in a previous comprehensive study although a 
different approach was taken to the assignment of chaper- 
ones to modules and quantification was not directly factored 
in [16]. 



4 Concluding remarks 

We present here, for the first time, absolute quantification 
of a functional class of the yeast proteome using the Qcon- 
CAT methodology as part of our COPY project. We have 
successfully quantified 51 of the 63 annotated chaperones 
in the yeast proteome by SRM using stable isotope labelled 
surrogate peptides with high precision, achieving quantifica- 
tion down to ca. 250 cpc. The sibling peptides for individ- 
ual proteins show a high level of agreement and we believe 
this represents the most accurate measurement of chaperone 
abundance in yeast. Nevertheless, the comparison with other 
approaches highlights the disparity that still exists in quanti- 
tative proteomics and the variance introduced by the different 
methodologies, which apparently exceeds that occurring bio- 
logically. 

We elected in most cases to use the SILAC-based quan- 
tification data [7] to present aggregated abundance data for 
protein volume, flux, folding efficiency and sub-cellular local- 
isation. This is one of the most comprehensive studies, with 
over 4000 proteins quantified. We reasoned that the relatively 
small number of proteins missed by this set are likely to be 
low abundance or not expressed, and therefore cannot greatly 
skew the analysis. Moreover, this is an MS-based dataset that 
does not involve epitope tagging (which could on occasions 
lead to lethal fusions or over-expression). Nevertheless, the 
global trends are maintained, regardless of dataset used to 
make the calculations (data not shown). 
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Figure 10. Abundance scaled sub-cellular proteomes of the different chaperone classes. The pie charts show the relative fraction of total 
protein volume of the sets of chaperone targets for each chaperone class, split by the sub-cellular localisation of the targets. The plots 
show that different chaperone classes display different preferences compared to the overall trends displayed in Fig. 8, specialising in the 
regulation of folding of proteins destined for different sub-cellular locations. Cytoplasmic proteins dominate all classes, but the SMALL 
class also has a preference towards nuclear and mitochondrial substrates, while prefoldins (PFD) are also linked to vacuolar membrane 
and ER, as well as of their key targets such as actin. 



The availability of quantitative datasets such as these sup- 
ports the estimation of protein synthetic flux (workload) and 
associated efficiency, which coupled to interactome and half- 
life data allows the different chaperones and classes to be 
compared. We report for the first time the direct correlation 
between chaperone abundance and the number, aggregated 
abundance and flux of their targets, an observation which 
makes energetic sense for yeast cells. This extends other re- 
ported data; for example, Bogumil et al. [16] note that AAA+ 
and CCT chaperones substrates are highly expressed and have 
high-codon adaptation indices, as do the chaperone modules 
which contain the chaperones themselves. We see this trend 
extended broadly across all classes, independent of experi- 
mental approach to quantify the proteome. 

We also see evidence that yeast chaperones are biased to 
mediate the folding of substrates from different sub-cellular 
localisations. Although the comprehensive study by Bogumil 
and colleagues [16] highlighted a broad spread of sub-cellular 
locations within chaperone modules they did not observe spe- 



cific module trends. We confirm the observation that chap- 
erones appear to mediate folding across a broad range of 
different sub-cellular localisations, and not just the one(s) 
where they are believed to reside. 

The integrated interactome and quantification data offer an 
alternative method to estimate chaperone substrate through- 
put, which may be compare to previous studies. Our analy- 
ses provide new estimates on the volume of protein folding 
passing for through the different chaperone pathways and 
classes, and we estimate as much as 44% of all protein syn- 
thesis goes through the RAC -induced route via Hsp70s and 
Hsp40s. 

Quantitative proteomics data, especially on absolute con- 
centrations, can provide a new interpretation of affinity pull- 
down inter-actomics data and better characterise molecular 
systems such as the chaperone network. We aim to extend 
these generalised findings to develop our understanding of 
the chaperone network and the dynamics and stoichiometry 
of protein folding and proteostasis in yeast. 
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